Learning Speech-Based Video Concept Models Using WordNet

Authors

  • Xiaodan Song
  • Ching-Yung Lin
  • Ming-Ting Sun

Abstract

Modeling concepts using supervised or unsupervised machine learning approaches is becoming increasingly important for video semantic indexing, retrieval, and filtering applications. Videos naturally contain multimodal data (audio, speech, visual, and text) that are combined to infer the overall semantic concepts. However, most research in the literature has been conducted within a single domain. In this paper we propose an unsupervised technique that builds context-independent keyword lists from WordNet for modeling desired speech-based concepts. Furthermore, we propose an extended speech-based video concept (ESVC) model that reorders and extends these keyword lists by supervised learning based on multimodal annotation. Experimental results show that the context-independent models achieve performance comparable to conventional supervised learning algorithms, and that the ESVC model achieves relative improvements of about 53% and 28.4% on two test subsets of the TRECVID 2003 corpus over a prior state-of-the-art speech-based video concept detection algorithm.
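
As a rough illustration of the idea of a context-independent keyword list, the sketch below expands a seed concept word through a small lexical graph and gives related words a weight that shrinks with distance from the seed. It is not the authors' algorithm: the abstract does not say which WordNet relations or weights are used, so the toy graph `TOY_RELATIONS`, the functions `build_keyword_list` and `score_transcript`, and the `max_depth` and `decay` parameters are all hypothetical and stand in for real WordNet lookups.

```python
from collections import deque

# Hypothetical toy graph standing in for WordNet relations (an assumption;
# a real system would query WordNet for related words instead).
TOY_RELATIONS = {
    "weather": ["forecast", "storm", "temperature"],
    "storm": ["thunderstorm", "hurricane"],
    "forecast": ["outlook"],
}


def build_keyword_list(seed, relations, max_depth=2, decay=0.5):
    """Breadth-first expansion from the seed word.

    Each word reached at depth d gets weight decay ** d, so the seed has
    weight 1.0 and more distant words count less.
    """
    weights = {seed: 1.0}
    queue = deque([(seed, 0)])
    while queue:
        word, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor in relations.get(word, []):
            if neighbor not in weights:
                weights[neighbor] = decay ** (depth + 1)
                queue.append((neighbor, depth + 1))
    # Highest weight first; ties broken alphabetically for stable output.
    return sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))


def score_transcript(transcript, keywords):
    """Sum the weights of keyword occurrences in a speech transcript."""
    lookup = dict(keywords)
    return sum(lookup.get(token, 0.0) for token in transcript.lower().split())


if __name__ == "__main__":
    keywords = build_keyword_list("weather", TOY_RELATIONS)
    print(keywords)
    print(score_transcript("a hurricane storm is coming", keywords))
```

No training data is involved in building the list, which is the sense in which such a list is unsupervised; reordering or extending it from annotated examples, as the ESVC model is described as doing, would be a separate supervised step that this sketch does not attempt.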

Similar resources

Automatic Construction of Persian ICT WordNet using Princeton WordNet

WordNet is a large lexical database of the English language in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...

Concept Learning and Categorization from the Web

In previous work, we found that a great deal of information about noun attributes can be extracted from the Web using simple text patterns, and that enriching vector-based models of concepts with this information about attributes led to drastic improvements in noun categorization. We extend this previous work in two ways: (i) by comparing concept descriptions extracted using patterns with descr...

TREC 2003 Video Retrieval and Story Segmentation Task at NUS PRIS

This paper describes the details of our systems for the story segmentation and search tasks of the TREC-2003 Video Track. In the story segmentation task, we propose a two-level multi-modal framework. First we analyze the video at the shot level using a variety of low- and high-level features, and classify the shots into pre-defined categories using a Decision Tree. Next we perform HMM analysis in or...

Associating Collocations with WordNet Senses Using Hybrid Models

In this paper, we introduce a hybrid method to associate English collocations with sense class members chosen from WordNet. Our combinational approach includes a learning-based method, a paraphrase-based method and a sense frequency ranking method. At training time, a set of collocations with their tagged senses is prepared. We use the sentence information extracted from a large corpus and cros...

A New WordNet Enriched Content-Collaborative Recommender System

Recommender systems are models that predict the potential interests of users among a number of items. These systems are widespread and have many real-world applications. They are generally based on one of two structural types: collaborative filtering and content filtering. Some systems are based on both. These systems are named hybrid recommen...

Publication date: 2005